No-Regret Learning from Partially Observed Data in Repeated Auctions
Related resources
No-Regret Learning in Repeated Bayesian Games
Recent price-of-anarchy analyses of games of complete information suggest that coarse correlated equilibria, which characterize outcomes resulting from no-regret learning dynamics, have near-optimal welfare. This work provides two main technical results that lift this conclusion to games of incomplete information, a.k.a., Bayesian games. First, near-optimal welfare in Bayesian games follows dir...
Online learning in repeated auctions
Motivated by online advertising auctions, we consider repeated Vickrey auctions where goods of unknown value are sold sequentially and bidders only learn (potentially noisy) information about a good’s value once it is purchased. We adopt an online learning approach with bandit feedback to model this problem and derive bidding strategies for two models: stochastic and adversarial. In the stochas...
No-regret Learning in Games
The study of learning dynamics in strategic environments has a long history in economic theory. Many different classes of learning algorithms have been considered in the literature and some have been shown to converge to equilibrium under certain conditions. In this note, I focus on a particular class of learning processes, called no-regret learning. While the no-regret framework was originally...
Estimating survival from partially observed data
Often, in cross-sectional-follow-up studies, survival data are obtained from prevalent cases only. This sampling mechanism introduces length bias. An added difficulty is that in some cases the times of onset cannot be ascertained or are recorded with great uncertainty. Such was the situation in the Canadian Study of Health and Aging (CSHA), an ongoing nation-wide study of dementia conducted...
Efficient No-Regret Multiagent Learning
We present new results on the efficiency of no-regret algorithms in the context of multiagent learning. We use a known approach to augment a large class of no-regret algorithms to allow stochastic sampling of actions and observation of the scalar reward of only the action played. We show that the average actual payoffs of the resulting learner get (1) close to the best response against (eventually...
Journal
Journal title: IFAC-PapersOnLine
Year: 2020
ISSN: 2405-8963
DOI: 10.1016/j.ifacol.2020.12.029